Spatial* and
Raster Data in RThis week tutorial will be a bit playful, as you will focus on ploting the distribution of a crypto‐zoological species – the North American Sasquatch, or Bigfoot (Imaginus magnapedum). Supposedly. Sasquatch belongs to a large primate lineage descended from the extinct Asian species Gigantopithicus blacki. Sasquatch is regularly reported in forested lands of western North America and is considered a significant indigenous American and western North American folk legend. However, the existence of this creature has never been verified with a typed specimen.
The main objective will be to generate a spatial representation of
the complete range of the species. This type of analysis is the starting
point to build a statistical model to describe the ” environmental
requirements” of a species. These models are often referred to as
“species distribution modelling” (SDM) or “ecological niche modelling”
(ENM). Regardless of the “tongue-in-cheek” nature of the organisms
evaluated, this work will provide you with an opportunity to combine
different spatial data in R.
Lozier et al. (2009) undertook a similar task. They build an SDM to show the need for careful evaluation of database records before their use in modelling, especially when the presence of cryptic species (morphologically indistinguishable biological groups that are incapable of interbreeding) is suspected, or many records are based on indirect evidence.
Spatial* objects to show
Bigfoot sightings.Importing occurrence a data.frame into R is
easy. Transforming it into a Spatial* object is also easy.
However, collecting, georeferencing, and cross-checking coordinate data
is a tedious task. Discussions about species distribution modelling
often focus on comparing modelling methods. Still, if you are dealing
with species with few and uncertain records, your focus should be on
improving the quality of the occurrence data. All modelling methods do
better if your occurrence data is unbiased, free of errors, and if you
have a relatively large number of records.
In this case, you will have a file (usually a .txt or .csv file) with point locality data. Such datasets include the known distribution of a species and hopefully some extra metadata. Such a file can be stored locally in your drive or an on-line repository (e.g., GBIF, GITHUB, or hub.arcgis.com).
At this point, you should be able to load spatial data stored locally
and from on-line repositories. As a reminder, in R, loading
online-data is easy if you have the Uniform Resource Locator (URL)
determining the location of the data set, which can be used directly as
an argument into the functions read.table() or
read.csv().
Your task:
Run the code below to see how you can load data from an online repository
# The URL is data provided by the Bigfoot Field Research Organization and stored in hub.arcgis.com
Bigfoot.URL <- "https://opendata.arcgis.com/datasets/9947fc49e6c44120b4a1b3133c073dbc_0.csv"
# Load the CSV file using the read.csv() function,
Bigfoot.dta <- read.csv(Bigfoot.URL)
#Print the first six rows of Bigfoot.dta
head(Bigfoot.dta)## X Y OBJECTID_1 FID_ FID_1 name
## 1 -121.8053 48.64056 1 0 891 1994-1997+
## 2 -121.0922 46.98333 2 1 892 Sep-83
## 3 -117.6767 46.33333 3 2 893 Jul-00
## 4 -122.1914 46.04722 4 3 894 Jul-85
## 5 -123.1967 47.56160 5 4 895 Aug-92
## 6 -122.5386 47.00440 6 5 896 Aug-00
## descriptio
## 1 Report 60: Missing Cattle and large footprints found\nClass B; Skagit County
## 2 Report 77: Couple hear vocalizations while camping at Milk Lake\nClass B; Kittitas County
## 3 Report 89: Fishermen hear large animal crashing through the brush in the Blue Mountains\nClass B; Columbia County
## 4 Report 270: Campers awakened by late-night visitor in the Indian Heaven Wilderness\nClass B; Skamania County
## 5 Report 283: Couple hear sounds and find footprints near Hoodsport\nClass B; Mason County
## 6 Report 337: Man working his land hears rock against wood sound and distant answer\nClass B; Pierce County
## timestamp_ ObjectId Lon Lat STATE_NAME STATE_FIPS Year
## 1 1994/05/13 00:00:00+00 892 -121.8053 48.64056 Washington 53 1994
## 2 1983/09/01 00:00:00+00 893 -121.0922 46.98333 Washington 53 1983
## 3 2000/07/24 00:00:00+00 894 -117.6767 46.33333 Washington 53 2000
## 4 1985/07/01 00:00:00+00 895 -122.1914 46.04722 Washington 53 1985
## 5 1992/08/29 00:00:00+00 896 -123.1967 47.56160 Washington 53 1992
## 6 2000/08/16 00:00:00+00 897 -122.5386 47.00440 Washington 53 2000
## ID sourceCountry ENRICH_FID areaType bufferUnits bufferUnitsAlias
## 1 0 US 1 RingBuffer esriMiles Miles
## 2 1 US 2 RingBuffer esriMiles Miles
## 3 2 US 3 RingBuffer esriMiles Miles
## 4 3 US 4 RingBuffer esriMiles Miles
## 5 4 US 5 RingBuffer esriMiles Miles
## 6 5 US 6 RingBuffer esriMiles Miles
## bufferRadii aggregationMethod HasData FedLandPt CritHabPt
## 1 1 BlockApportionment:Landscape.huc12 1 72.1 51.5
## 2 1 BlockApportionment:Landscape.huc12 1 91.2 19.8
## 3 1 BlockApportionment:Landscape.huc12 1 36.8 1.0
## 4 1 BlockApportionment:Landscape.huc12 1 21.3 7.7
## 5 1 BlockApportionment:Landscape.huc12 1 91.0 76.7
## 6 1 BlockApportionment:Landscape.huc12 1 39.9 40.1
## NLCDsnIcPt NLCDfrstPt NLCDssgPt NLCDAgPt NLCDWetPt NLCDDevPt
## 1 5.3 72.5 17.3 0.0 0.9 1.7
## 2 0.5 81.0 15.1 0.1 0.4 2.8
## 3 0.1 23.5 54.7 17.9 1.9 1.9
## 4 1.0 68.2 14.6 0.0 0.6 4.5
## 5 3.6 75.0 18.7 0.0 0.3 2.1
## 6 0.1 44.2 17.7 10.6 6.5 20.7
As you see from the printout on the last page (and below), there are many variables in this data set.
## X Y OBJECTID_1 FID_ FID_1 name
## 1 -121.8053 48.64056 1 0 891 1994-1997+
## 2 -121.0922 46.98333 2 1 892 Sep-83
## 3 -117.6767 46.33333 3 2 893 Jul-00
## 4 -122.1914 46.04722 4 3 894 Jul-85
## 5 -123.1967 47.56160 5 4 895 Aug-92
## 6 -122.5386 47.00440 6 5 896 Aug-00
## descriptio
## 1 Report 60: Missing Cattle and large footprints found\nClass B; Skagit County
## 2 Report 77: Couple hear vocalizations while camping at Milk Lake\nClass B; Kittitas County
## 3 Report 89: Fishermen hear large animal crashing through the brush in the Blue Mountains\nClass B; Columbia County
## 4 Report 270: Campers awakened by late-night visitor in the Indian Heaven Wilderness\nClass B; Skamania County
## 5 Report 283: Couple hear sounds and find footprints near Hoodsport\nClass B; Mason County
## 6 Report 337: Man working his land hears rock against wood sound and distant answer\nClass B; Pierce County
## timestamp_ ObjectId Lon Lat STATE_NAME STATE_FIPS Year
## 1 1994/05/13 00:00:00+00 892 -121.8053 48.64056 Washington 53 1994
## 2 1983/09/01 00:00:00+00 893 -121.0922 46.98333 Washington 53 1983
## 3 2000/07/24 00:00:00+00 894 -117.6767 46.33333 Washington 53 2000
## 4 1985/07/01 00:00:00+00 895 -122.1914 46.04722 Washington 53 1985
## 5 1992/08/29 00:00:00+00 896 -123.1967 47.56160 Washington 53 1992
## 6 2000/08/16 00:00:00+00 897 -122.5386 47.00440 Washington 53 2000
## ID sourceCountry ENRICH_FID areaType bufferUnits bufferUnitsAlias
## 1 0 US 1 RingBuffer esriMiles Miles
## 2 1 US 2 RingBuffer esriMiles Miles
## 3 2 US 3 RingBuffer esriMiles Miles
## 4 3 US 4 RingBuffer esriMiles Miles
## 5 4 US 5 RingBuffer esriMiles Miles
## 6 5 US 6 RingBuffer esriMiles Miles
## bufferRadii aggregationMethod HasData FedLandPt CritHabPt
## 1 1 BlockApportionment:Landscape.huc12 1 72.1 51.5
## 2 1 BlockApportionment:Landscape.huc12 1 91.2 19.8
## 3 1 BlockApportionment:Landscape.huc12 1 36.8 1.0
## 4 1 BlockApportionment:Landscape.huc12 1 21.3 7.7
## 5 1 BlockApportionment:Landscape.huc12 1 91.0 76.7
## 6 1 BlockApportionment:Landscape.huc12 1 39.9 40.1
## NLCDsnIcPt NLCDfrstPt NLCDssgPt NLCDAgPt NLCDWetPt NLCDDevPt
## 1 5.3 72.5 17.3 0.0 0.9 1.7
## 2 0.5 81.0 15.1 0.1 0.4 2.8
## 3 0.1 23.5 54.7 17.9 1.9 1.9
## 4 1.0 68.2 14.6 0.0 0.6 4.5
## 5 3.6 75.0 18.7 0.0 0.3 2.1
## 6 0.1 44.2 17.7 10.6 6.5 20.7
For now, you will focus on the positional information stored in the
columns X (or Lon) and Y (or
Lat). The first step to assess what this information is
showing is to display it using a scatter plot.
Your task: Using the data stored in the object
Bigfoot.dta in the memory you will: Plot a scatter-plot
showing the location (i.e., the Latitude [Y or
Lat] and Longitude [X or Lon]) of
each Bigfoot sighting.
# Plot the spatial set-up of the observations stored in Bigfoot.dta as red-filled circles.
plot(Bigfoot.dta [, c("Lon","Lat")], # Specify the object with the position information
pch = 19, # Define the type of point to plot.
col = "red" # Define colour of the points.
)This first visualisation provides a reasonable frame of reference,
but it would be better to include the shape of continental North America
for reference purposes. Many such pre-loaded maps are part of
R packages (e.g.maptools, maps,
rworldmap, ggplot2).
As a start, you will use the wrld_simpl data set part of
the maptools package.
Your task:
Explore and Plot the data in the wrld_simpl object (part
of the maptools package).
# use the function summary() to explore the information stored in the object
summary(wrld_simpl)## Object of class SpatialPolygonsDataFrame
## Coordinates:
## min max
## x -180 180.00000
## y -90 83.57027
## Is projected: FALSE
## proj4string :
## [+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs]
## Data attributes:
## FIPS ISO2 ISO3 UN NAME
## : 3 AD : 1 ABW : 1 Min. : 4.0 Aaland Islands: 1
## AC : 1 AE : 1 AFG : 1 1st Qu.:215.0 Afghanistan : 1
## AE : 1 AF : 1 AGO : 1 Median :429.0 Albania : 1
## AF : 1 AG : 1 AIA : 1 Mean :431.8 Algeria : 1
## AG : 1 AI : 1 ALA : 1 3rd Qu.:650.5 American Samoa: 1
## AJ : 1 AL : 1 ALB : 1 Max. :894.0 Andorra : 1
## (Other):238 (Other):240 (Other):240 (Other) :240
## AREA POP2005 REGION SUBREGION
## Min. : 0.0 Min. :0.000e+00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 44.5 1st Qu.:1.275e+05 1st Qu.: 2.00 1st Qu.: 14.00
## Median : 5515.5 Median :3.193e+06 Median : 19.00 Median : 30.00
## Mean : 52696.1 Mean :2.464e+07 Mean : 65.43 Mean : 54.84
## 3rd Qu.: 34708.8 3rd Qu.:1.240e+07 3rd Qu.:142.00 3rd Qu.: 61.00
## Max. :1638094.0 Max. :1.313e+09 Max. :150.00 Max. :155.00
##
## LON LAT
## Min. :-178.13 Min. :-80.4460
## 1st Qu.: -50.16 1st Qu.: -0.3025
## Median : 17.66 Median : 16.5110
## Mean : 13.28 Mean : 16.4289
## 3rd Qu.: 50.01 3rd Qu.: 39.1067
## Max. : 179.22 Max. : 78.8300
##
# Print the variable names using names()
names(wrld_simpl)## [1] "FIPS" "ISO2" "ISO3" "UN" "NAME" "AREA"
## [7] "POP2005" "REGION" "SUBREGION" "LON" "LAT"
# Look at the first five rows of the Data matrix stored in wrld_simpl using head()
head(wrld_simpl@data)## FIPS ISO2 ISO3 UN NAME AREA POP2005 REGION SUBREGION
## ATG AC AG ATG 28 Antigua and Barbuda 44 83039 19 29
## DZA AG DZ DZA 12 Algeria 238174 32854159 2 15
## AZE AJ AZ AZE 31 Azerbaijan 8260 8352021 142 145
## ALB AL AL ALB 8 Albania 2740 3153731 150 39
## ARM AM AM ARM 51 Armenia 2820 3017661 142 145
## AGO AO AO AGO 24 Angola 124670 16095214 2 17
## LON LAT
## ATG -61.783 17.078
## DZA 2.632 28.163
## AZE 47.395 40.430
## ALB 20.068 41.143
## ARM 44.563 40.534
## AGO 17.544 -12.296
# Check what type of object is wrld_simpl using class()
class(wrld_simpl)## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
# plot wrld_simpl using plot
plot(wrld_simpl)The object named wrld_simpl is a
SpatialPolygonsDataFrame with multiple attributes
associated to each of the Poligons bounded together - this
is the information you summarised with function
summary().
In this case, you are not interested in the entire globe, but rather
only the three “North American” countries: Canada, United States, and
Mexico. One way to generate a new SpatialPolygonsDataFrame
only for North American countries is Sub setting wrld_simpl
using the NAMES variable as you would do in a
data.frame.
Your task:
Subset wrld_simpl so the new
SpatialPolygonsDataFrame only has the data and spatial
presentation for Canada, United States, and Mexico.
# Define a TRUE/FALSE vector that assesses if each name in wrld_simpl NAME variables is either "Canada", "United States"or "Mexico".
is.NoAmCount <- wrld_simpl@data[, "NAME"]%in%c("Canada", "United States", "Mexico")
# Sub setting wrld_simpl using the is.NoAmCount vector.
NoAmPoly <- wrld_simpl[is.NoAmCount, ]
# Show the current names in the "reduced" SpatialPolygonsDataFrame.
NoAmPoly$NAME## [1] Canada Mexico United States
## 246 Levels: Aaland Islands Afghanistan Albania Algeria ... Zimbabwe
The summary of NoAmPoly shows multiple features, one per
country. When NoAmPoly is plotted over
Bigfoot.dta it provides spatial context to Bigfoot
sightings.
Your task:
Plot the location (i.e., the Latitude [Y or
Lat] and Longitude [X or Lon]) of
each Bigfoot sighting OVER the reference map with only
North American countries.
# Plot Bigfoot sightings as red-filled circles.
plot(Bigfoot.dta[, c("Lon", "Lat")] , # Define the object to plot.
pch = 19, # Define the marks are filled circles.
col = "red" # Define the colour of the marks
)
# add the Reference map - only North American countries.
plot(NoAmPoly , # Define the SpatialPolygonsDataFrame to plot.
add = T # Should the Geometry be added to the current plot?
)Like many other maps, the maps you have plotted so far represent a distorted view of the space. Why is that?, because the Mercator projection (the one used most often when the information is measured in Latitude and longitude) make areas near to the Earth’s polar regions (Greenland, Alaska and Antarctica, for example) look much larger than they are relative to areas nearer to the equator. This is because Mercator’s way of flattening the globe involves stretching the far northern and far southern parts of the world out until a flat, rectangular map is achieved.
The question is how to pick the “right” way to represent the area of interest. There is no clear-cut answer to this, as it depends on the mapping exercise objective. For example, do you want to focus on a small area or an entire continent?. Lucky for you, mapping agencies everywhere have often already flagged a reasonable projection for whatever region/country you are interested in, so your first step is to look for that with a quick Internet search.
A version of Albers Equal Area projection is optimised for North America’s characteristics (wider east to west than north to south extents) for the study region. If South America were the focal region, you would need a different projection optimised to represent an area larger north to south than broader east to west.
With this information, you can look for the PROJ string in either the
PRøj https://proj.org/ or
Spatial Reference https://spatialreference.org/ websites and re-project
your North America map using the spTransform() function
from the sp package.
Your task:
You will now use the spTransform() function from the
sp package to re-project the wrld_simpl the
world map to an Albers Equal Area projection for North America.
Once re-projected, plot the resulting
SpatialPolygonsDataFrame to assess how the visualisation of
wrld_simpl changed.
# Define the modified Albers Equal Area projection for North America <//epsg.io/42303.proj4>
CRS.String <- "+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_def"
# Using the spTransform function to change the projection.
wrld.AEA <- spTransform(x = wrld_simpl, # The object to "re-project".
CRSobj = CRS(CRS.String) # the new projection as a CRS object.
)
# Plot the re-projected map.
plot(wrld.AEA)You see now that the focal area of the map is now North America (it is at the centre of the figure), with other areas distorted to the point of not being recognisable in many cases (i.e., Russia or China).
Now, to accurately represent the region of interest, you can subset
wrld.AEA to only the North American Countries (Canada,
United States, and Mexico).
Your task:
Generate a project map for North America by sub setting
wrld.AEA to only have the information for Canada, United
States, and Mexico.
after this, plot your re-projected North American Map.
# Define a TRUE/FALSE vector that assesses if each name in wrld_simpl NAME variables is either "Canada", "United States"or "Mexico"
is.NoAmCount <- wrld.AEA@data[, "NAME"] %in% c("Canada", "United States", "Mexico")
# Sub setting wrld.AEA using the is.NoAmCount vector.
NoAmPoly.AEA <- wrld.AEA[is.NoAmCount, ]
# Plot the re-projected map.
plot(NoAmPoly.AEA)Alternatively, you can re-project the
SpatialPolygonsDataFrame with only North American Countries
(NoAmPoly). An example of how this can be done is
below.
# Approach two: Re-project NoAmPoly
# Using the spTransform function to change the projection.
NoAmPoly.AEA <- spTransform(x = NoAmPoly , # The object to "re-project".
CRSobj = CRS(CRS.String) # the new projection as a CRS object.
)
# Plot the re-projected map.
plot(NoAmPoly.AEA)The base map is now perfect to represent the area of interest so that
distances and areas are accurately represented. But what about
the observations? These are still in Latitude and Longitude
(only georeferenced). Also, it is still a data.frame and
not a Spatial* object.
Your task:
Using the function coordiates() create a
SpatialPointDataFrame based on the information in
Bigfoot.dta.
Here you will use two variables (Lon and
Lat) in Bigfoot.dta to define the spatial
information.
Once you are done creating this SpatialPointDataFrame,
print its’ summary.
# Copy the data.frame to another object so the original information is available after the object transformation
Bigfoot.SpaPnt <- Bigfoot.dta
# Use the coordinates function to transform Bigfoot.SpaPnt from a data.frame to a SpatialPointDataFrame
coordinates(Bigfoot.SpaPnt) <- ~ Lon + Lat # note that the transformation to a SpatialPointDataFrame is done by defining a formula were the coordinates are the predictors.
# What is the class of the object after using coordinates?
class(Bigfoot.SpaPnt)## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
# Plot the SpatialPointDataFrame..
plot(Bigfoot.SpaPnt,
pch = 19, # Define the marks are filled circles.
col = "red" # Define the colour of the marks
)You could also do the operation above by creating a
SpatialPoints object and merging a data.frame to it, or
using the SpatialPointDataFrame() function. The code below
shows how this is done.
Bigfoot.Pnt <- SpatialPointsDataFrame(coords = Bigfoot.dta[, c("Lon", "Lat")] ,# Define the coordinates for each observation
proj4string = CRS("+proj=longlat +datum=WGS84"), # Define the projection - It should be WGS84 as you have Lat/long Info.
data = Bigfoot.dta # Define the data.frame to be appended.
)
# Plot the SpatialPointDataFrame..
plot(Bigfoot.Pnt,
pch = 19, # Define the marks are filled circles.
col = "red" # Define the colour of the marks
)Once you have this SpatialPointDataFrame, plotting it
would produce a very similar map to the one created with
Bigfoot.dta, but without the coordinates and “stretched”
east-to-west. This is what you need to
Your goal now is to re-project the sightings points to the “best”
projection for the region of interest. Like before, this is using the
spTransform() function from the sp package.
But before you do this, check the Bigfoot.SpaPnt projection
using the proj4string() function. For this just run the
code below and ignore the warning.
#check the projection of `Bigfoot.SpaPnt`
proj4string(Bigfoot.SpaPnt)## [1] NA
Now that you know the map is in geographical coordinates, lets
project it to an Albers Equal Area projection. One thing before we move
on, if there was no coordinate system in the object (the case if the
object is created using coordinates()) you would need to
add a initial projection manually.
Your task:
Use the function spTransform() to re-project
Bigfoot.SpaPnt to an Albers Equal Area projection.
With that last step, you are ready to generate a nicer looking map of the region of interest and the recorded Bigfoot sightings.
Your task:
Run the code below to plot your projected Bigfoot sightings object
(Bigfoot.SpaPnt) OVER your projected map
of North American counties (NoAmPoly.AEA).
# Define the plotting space [see ?par so check what xaxs = "i" and yaxs = "i" do]
par(mar = c(2, 2, 6, 2),
xaxs = "i", yaxs = "i")
# Plot the Area of interest making the continents light-grey, the boundaries black, and the oceans light-blue.
plot(NoAmPoly.AEA , # Define the SpatialPolygonsDataFrame to plot - projected North America
col = 'lightgrey', # make the continental areas light-grey
bg = 'lightblue', # Make the oceans blue
xlim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 1]), # Set limits to zoom into the region with points
ylim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 2]), # Set limits to zoom into the region with points
main = "Bigfoot sigthings\n[1869 to 2017]" ,
cex.main = 1.2)
# Add a bounding box, some tick marks to define the latitude-longitude point
box(); axis(1, labels = NA); axis(2, labels = NA); axis(3, labels = NA); axis(4, labels = NA)
# Note that the commands above are separated with a semicolon (;) as this allows to put consecutive functions in one line.
# Plot Bigfoot sightings as a red-filled points
points(Bigfoot.SpaPnt.AEA, # Define the SpatialPointsDataFrame to plot - Projected Big foot obs
pch = 19, # Define the mark shape.
col = "red", # Define the mark colour.
cex=0.7 # Define the size of the dots.
)This map is clearly better than that first raw visualization!!!
One of the great things about R is that you can make
more than one type of visualization, particularity is you are making an
interactive document.
Below is an example of such interactive map dome with the
mapview package wich builds on leaflet.
require(mapview)
mapview(Bigfoot.SpaPnt, xcol = "Longitude", ycol = "Latitude", grid = T,cex= 2,col.regions="red")Raster* objects.Now that you know where Bigfoot has been “sighted” across North
America is time to get environmental information to define the
conditions that determine the best habitats for Bigfoot. This is the
objective of species distribution modelling (SDM). In
SDMs, environmental information is usually extracted
from raster files stored in some kind of GIS format.
Variables can include climatic, soil, terrain, vegetation, land use, and
other variables.
As a start, you will focus on climatic data for the region of interest. There are many sources of climatic information freely available to be used. Some examples are the Climatic Research Unit (CRU http://www.cru.uea.ac.uk/), Climatologies at high resolution for the Earth’s land surface areas (CHELSA https://chelsa-climate.org/), WorldClim v2. (https://www.worldclim.org/), ecoClimate (https://www.ecoclimate.org/), global climatologies for bioclimatic modelling (CliMond https://www.climond.org/) and climate re-analyses such as ERA5-Land (https://bit.ly/39kQlOL).
For simplicity, you will focus now on WorldClim climatologies and
obtain the 19-bioclimatic variables commonly used in SDMs. These can be
downloaded and stored locally. Some of these can be loaded directly from
R with the use of packages. For example, WorldClim data can
be obtained directly from R using the
getData() function of the raster package.
Climate data from CRU can be obtained directly from R using
the create_CRU_stack() function of the
getCRUCLdata package. For an example of how this can be
done see the code below.
# Now use the function getData() to download the 19-BioClimatic part of WorldClim at a 10ArcMis resolution.
WC.Bioclim <- getData(name = 'worldclim', # Define the Data set name.
res = 10, # This defines the resolution of the Imported raster (10ArcMis)
var = 'bio') # This defines that BioClimatic variables are imported
# Look at the names of the variables in WC.Bioclim.
names(WC.Bioclim)By using getData() you get a number of files that are
downloaded into a folder named wc10 in the working working
directory. These are ESRI .hdr Labelled files named bio1 to
bio19.
You will not do this, as I have downloaded it as is available in your
session. The object name is WC.Bioclim as it is a
RasterStack with 19 layers, coded Bioclim.1 to
Bioclim.19 (see https://www.WorldClim.org/data/bioclim.html for a
translation of the codes). But, how do they look? The only way to see
that is to plot the variables.
Your task:
Plot the Bioclim.1 (Annual Mean Temperature [C]) and
Bioclim.12 (Total Annual Precipitation [mm/yr]) layers of
WC.Bioclim.
The maps should be shown as two panels in the same figure.
# Define a plotting space with One-Row and Two-columns
par(mfrow = c(1, 2))
# plot Annual Mean Temperature (WorldClim temp are C*10 so you need to do some operations before plotting).
# Define a divergent blue-to-grey-to-red HCL palette with 50 values.
plot(WC.Bioclim[['Bioclim.1']]/10, # Define the Layer from WC.Bioclim to plot. I divide the value by 10 as WorldClim temp are C*10 (to avoid using Floating Numbers)
main = "Annual Mean Temperature\n[C]", # Set the Figure title
cex.main = 1.2, # Define the Title font size
col = hcl.colors(n = 50 , palette = "Blue-Red") # Define a divergent HCL palette with 50 values.
)
# plot Total Annual Precipitation
# Define a divergent red-to-grey-to-blue HCL palette with 50 values.
plot(WC.Bioclim[['Bioclim.12']], # Define the Layer from WC.Bioclim to plot.
main = "Total Annual Precipitation\n[mm/yr]", # Set the Figure title
cex.main = 1.2,# Define the Title font size
col = rev(hcl.colors(n = 50 ,palette = "Blue-Red")) # Define a divergent HCL palette with 50 values.
)For clarity, “bioclimatic” summaries represent annual trends (e.g., mean annual temperature, annual precipitation), seasonality (e.g., annual range in temperature and precipitation) and extreme or limiting environmental factors (e.g., the temperature of the coldest and warmest month, and precipitation of the wet and dry quarters).
The best way to know what information is summarised in these variables is to build them yourself, so this is your next task. Specifically, you will now create Bioclim-7 [Temperature Annual Range] as an example.
For this, you will need the monthly maximum and minimum temperatures,
which can be downloaded using the var = 'tmin' and
var = 'tmax' arguments on the getData()
function of the raster package. This data is now loaded in
the memory.
Both WC.Tmin and WC.Tmax are
RasterStacks objects with 12 bands (one for each one of the
months). The first step to build a raster with the Temperature Annual
Range (Bioclim-7) is to estimate the Maximum Temperature of Warmest
Month (Bioclim-5) and the Minimum Temperature of Coldest Month
(Bioclim-6).
These can be estimated directly on a the RasterStack
using the max() and min() functions.
Alternatively, you can use the function calc() from the
raster package. This function allows for more elaborate
calculations as you can define more complex functions with the data
within a single cell in a multi-band raster (see ?calc for
more info on the function).
Your task:
Using the monthly minimum (WC.Tmin) estimate the minimum
temperature of the coldest month (the minimum value in
WC.Tmin) and plot it. Remmebr that temperatures here are C
x 10.
#Estimate the minimum temperature of coldest month
WC.MinTCoMo <- min(WC.Tmin)
#plot the minimum temperature of coldest month
plot(WC.MinTCoMo/10)Your task:
Using the monthly maximum (WC.Tmax) estimate the maximum
temperature of the warmest month (the maximum value in
WC.Tmax) and plot it.
# Estimate the maximum temperature of warmest month
WC.MaxTWaMo <- max(WC.Tmax)
#plot the maximum temperature of coldest month
plot(WC.MaxTWaMo/10)Doing the operations above transforms WC.Tmax and
WC.Tmin from a RasterStack with values for
each month to a RasterLayer that only has the maximum or
minimum monthly temperature values (that is, a one layer raster).
With this information, it is now possible to generate an Annual
Temperature Range (Bioclim-7) RasterLayer. For this, you
will subtract the Minimum Temperature of Coldest Month (Bioclim-6;
WC.MinTCoMo) from the Maximum Temperature of the Warmest
Month Bioclim-5; WC.MaxTWaMo). This can be done using the
RasterLayers directly or the calc() function
if you bundle WC.MinTCoMo and WC.MaxTWaMo in
one RasterStack.
Your task:
Using your estimates of the maximum temperature of the warmest month
(WC.MinTCoMo) and the minimum temperature of the coldest
month (WC.MaxTWaMo), estimate the Temperature Annual
Range.
Once you have done this, plot your Temperature Annual Range estimation.
#Estimate the minimum temperature of coldest month
WC.MinTCoMo <- min(WC.Tmin)
# Estimate the maximum temperature of warmest month
WC.MaxTWaMo <- max(WC.Tmax)Now you know how these bioclimatic variables are generated. But the
process above is long and tedious, done only to understand how raster
operations work. When you face a similar task in the future, you could
use thebioclim() function for the dismo
package. I leave that task to the interested student.
Raster* objects.You have a set of predictor variables as rasters in the
WC.Bioclim object. Your next task is mapping the climatic
predictors as done with the occurrences. The first step will be cropping
the global climate maps to the region of interest (North America). You
could do this using the crop() function of the
raster package and defining the cutting region using the
Bigfoot.SpaPnt object.
Your task:
Crop the WC.Bioclim raster using the spatial extent of
your georeferenced observations (Bigfoot.SpaPnt).
After you have done this, plot the resulting
RasterLayer.
# Crop the WC.Bioclim raster using the observations spatial extent
NoAm.WC.Bioclim <- crop(x= WC.Bioclim, # Define the Raster to crop
y = Bigfoot.SpaPnt # Define the spatial object that determine the extent to be cropped
)
# Plot the cropped Raster
plot(NoAm.WC.Bioclim)Raster* objects.As you see from the plot above, the result of the crop()
function is a subset of the 19-BIOCLIM variables for continental North
America. One more thing here, the plot function in a
RasterStack only plots up to 16-layers. However, like your
first observations map, the visualisation does not look very “pretty”.
So the first thing to do is re-project these to an Albers Equal Area
projection using the projectRaster() function.
Your task:
Using the function projectRaster(), re-project your
North America cropped bioclimatic variables raster
(NoAm.WC.Bioclim) to an Albers Equal Area projection.
After you have done this, plot the resulting
RasterLayer.
# project the NoAm.WC.Bioclim raster
NoAm.WC.Bioclim.AEA <- projectRaster(from = NoAm.WC.Bioclim, # Define the Raster to be projected.
crs = CRS(CRS.String) # Define the projection as a CRS object.
)
# Plot the re-projected Raster
plot(NoAm.WC.Bioclim.AEA)Raster*
objects.Now you have a series of projected rasters that can adequately show the climatic conditions in the region of interest. You will now focus on Annual Mean Temperature (bio1) as an example of producing an “almost” publication ready map. You can do this in many (and more straightforward) ways, but here you will use a more “laborious” approach, so you know how to use alternative plotting parameters.
Your task:
Generate the best visualisation for Annual Mean Temperature (bio1)
using the projected Bioclimatic raster
(NoAm.WC.Bioclim.AEA) by taking the following steps:
Define a plotting region so that the map is bounded within a region.
Plot the raster of Annual Mean Temperature using a Yellow-to-Red colour ramp palette, setting the minimum and maximum values for which colours should be plotted to -20 and 30.
Add a bounding box and axes with only tick-marks.
Add your projected map (NoAmPoly.AEA) of the region
of interest, the projected observation points
(Bigfoot.SpaPnt.AEA), and a legend to say what the plotted
symbols are.
# Step 1: define a plotting region so that the maps do not get affected by how big is the plotting space
par(mfrow=c(1,1),
mar=c(2,2,4,2), # Defines the margins of the plotting space
fig=c(0,1,0.1,1),# this argument defines a smaller figure space
new=F) # this argument states that a blank plotting space is used
# create a new plotting space
plot.new()
# Define the "limits" of the plotting space
plot.window(xlim = extent(Bigfoot.SpaPnt.AEA)[1:2], # get the xlim form the projected Bigfoot sightings
ylim = extent(Bigfoot.SpaPnt.AEA)[3:4]) # get the ylim form the projected Bigfoot sightings)
# Step 2: plot the raster
# add the raster of Annual Mean Temperature
image(NoAm.WC.Bioclim.AEA[[1]]/10, # Define the Raster Layer to plot. You know why you divide by 10 here
col = hcl.colors(100, "YlOrRd", rev = TRUE), # Define a colour ramp palette
zlim = c(-20,30), # define the range of values to plot
add=T # Add this figure to the current plot?
)
# Add a Main title manually
mtext("Annual Mean Temperture [C]", # Define the main title text
side = 3, # Define the margin to locate the text.
cex = 1.5, # Define the text size
font = 2, # Define is the text is bold
line = 1.5 # Define the lien to add the text
)
# Step 3: Add area elements
#Add a bounding box
# Add a bounding box and the axes with ticks
box(); axis(1, labels = NA); axis(2, labels = NA); axis(3, labels = NA); axis(4, labels = NA)
# Step 4: Add other elements
# Add a map of the region of interest
# Plot the Area of interest
plot(NoAmPoly.AEA, # Define the SpatialPolygonsDataFrame to be plotted.
add=T # Add this figure to the current plot?
)
# Plot Bigfoot sightings
points(Bigfoot.SpaPnt.AEA, # Define the SpatialPointsDataFrame to plot.
pch = 10, # Define the shape of the mark.
col = "black" # Define the colour of the mark.
)
# Add a legend to say what the symbols are - Manually
legend("bottomleft",
pch=10,
legend = "Bigfoot\nsightings",
bty="n")
# Step 4: Add a colour ramp by hand
par(fig=c(0,1,0.01,0.13),# this argument defines a smaller figure space
mar=c(2,2,1,2),
new=T) # this argument states that a black plotting space is used
# Plot a matrix with all possible values
image((matrix(1:100)),
axes=F,
col = hcl.colors(100, "YlOrRd", rev = TRUE))
# Add a bounding box
box()
# Add an axis with the values
axis(1,
at = seq(0,1, length.out = 6),
labels = c(-2:3)*10)
# Add a main title.
mtext("Annual Mean Temperture [C]",
side = 3,
cex = 1,
line = 0.3)par(new=F,
fig=c(0,1,0,1)) # this argument states that a black plotting space is used Raster* object to use as
predictorsThe next step is to extract the values of the predictors at the
locations where Bigfoot has been sighted. That is, get data about the
climate that the species apparently likes. This is a very
straightforward thing to do using the extract() function
from the raster package. The question is which Bigfoot
sighting object should you use: Bigfoot.dta,
Bigfoot.SpaPnt, or Bigfoot.SpaPnt.AEA. The
answer to this is given by the reference system of
WC.Bioclim.
Your task:
Using the function projection(), print the projection
string of the WC.Bioclim raster.
# Extract the projection
projection(WC.Bioclim)## [1] "+proj=longlat +datum=WGS84 +no_defs"
As you see, this is a Mercator projection, so you could use
Bigfoot.dta or Bigfoot.SpaPnt to extract the
values for each of the sites where Bigfoot has been sighted. If you
instead used NoAm.WC.Bioclim.AEA as your source raster, you
would need to use Bigfoot.SpaPnt.AEA to get the data. With
that clear, now you will extract the environmental information.
Your task:
Using the function extract() get the values for the 19
Bioclimatic variables in WC.Bioclim for each Bigfoot
sightings. for this use the Bigfoot.SpaPnt spatial
object.
# The same as above but using the SpatialPointDataFrame
BFC <- extract(x = WC.Bioclim ,# Define the RasterStack with the Environmental information.
y = Bigfoot.SpaPnt # Define the object with the locations to get the data
)
# Print the resulting data.frame head.
head(BFC)## Bioclim.1 Bioclim.2 Bioclim.3 Bioclim.4 Bioclim.5 Bioclim.6 Bioclim.7
## [1,] 50.66667 125.8889 -18.7777786 1675.6666 255.88889 47.55556 52.11111
## [2,] 37.66667 119.5556 -35.3333321 1727.0000 285.77777 29.11111 62.44444
## [3,] 66.00000 156.5556 -20.3333340 642.2222 85.66666 18.00000 40.44444
## [4,] 73.22222 142.7778 12.0000000 2503.5557 402.44446 35.55556 62.00000
## [5,] 51.44444 116.4444 -0.3333333 2295.1111 371.77777 46.66667 59.66667
## [6,] 104.44444 168.1111 44.6666679 1236.7778 202.11111 21.55556 62.33333
## Bioclim.8 Bioclim.9 Bioclim.10 Bioclim.11 Bioclim.12 Bioclim.13 Bioclim.14
## [1,] 742.0000 168.66667 187.77777 679.0000 88.22222 34.44444 5752.222
## [2,] 826.1111 134.44444 143.77777 767.7778 91.77778 33.66667 6148.333
## [3,] 252.0000 73.44444 84.66666 233.3333 131.11111 38.22222 6936.333
## [4,] 1178.8889 176.44444 191.11111 1097.1111 100.66666 40.22222 5209.556
## [5,] 1051.5555 179.22223 200.66667 983.1111 82.44444 38.11111 4709.111
## [6,] 587.3333 91.22222 91.22222 537.8889 103.33334 42.22222 4913.778
## Bioclim.15 Bioclim.16 Bioclim.17 Bioclim.18 Bioclim.19
## [1,] 195.6667 -57.333332 253.0000 -13.888889 124.8889
## [2,] 193.5556 -74.222221 267.7778 -30.333334 117.3333
## [3,] 266.8889 -71.111115 338.0000 -14.666667 154.2222
## [4,] 222.7778 -25.666666 248.4444 16.444445 141.2222
## [5,] 180.3333 -34.222221 214.5556 3.555556 113.2222
## [6,] 244.8889 3.666667 241.2222 49.222221 168.1111
Raster* object
to use as predictorsOne important check is whether there are observations with no
environmental data (as these could be located in areas where the
RasterStack has no data). You can do this using a simple
apply loop.
Your task:
Using the function apply(), define the sightings with no
bioclimatic information (rows with NA).
# Determine the rows where the variables have NA as a value.
apply(BFC ,
MARGIN = 2, # apply the function to each column
function(x){
which(is.na(x))})# test for each column the rows with no values## integer(0)
The code above shows that rows 3595 and 3682 do not have
environmental data. To see where they are, you can plot the “nice map”
again at the end of the previous section modifying the
points() function to mark the locations with no
environmental data.
Your task:
Plot the “nice map” at the end of the previous section, modifying the
points() function to mark the locations with no
environmental data as black filled points.
#Plot the are of interest, observations and Latitudes and Longitudes lines
# define the plotting area
par(mar = c(2, 2, 6, 2), xaxs = "i", yaxs = "i")
# Plot the Area of interest
plot(NoAmPoly.AEA,
col = 'lightgrey', # make the continual areas light-grey
bg = 'lightblue', # Make the oceans blue
xlim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 1]), # Set limits to zoom into the region with points
ylim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 2]), # Set limits to zoom into the region with points
main = "Bigfoot sigthings\n[1869 to 2017]",
cex.main = 1.2)
box()
# Add some tick marks to define the latitude-longitude point
# Note that these commands are separated with a semicolon (this allows to put consecutive functions in one line)
axis(1, labels = NA); axis(2, labels = NA); axis(3, labels = NA); axis(4, labels = NA)
# Plot Bigfoot sightings
points(Bigfoot.SpaPnt.AEA ,
pch=19,
col = "red")
# Show the sites with no environmental data
points(Bigfoot.SpaPnt.AEA[c(3595, 3682), ] ,
pch = 19,
cex=1.5,
col = "black")This figure shows that a couple of points in North Carolina do not have environmental data and should be removed from the analysis.
The work you have done today is the type of work you would do in an actual project. You have explored your observations and predictors, then matched these two together, and finally generated a model to test a hypothesis and make some predictions.